
Revamp the end-of-test summary #4089

Open
wants to merge 49 commits into base: master
Conversation

joanlopez
Contributor

@joanlopez joanlopez commented Dec 4, 2024

Overview

This pull request replaces the current end-of-test summary with a new design, with two available formats:

  • a) compact (default)
  • b) full

The goal is to bring clearer and more valuable results to users. Find a screenshot below.

User-facing details

  • 💹 The appearance of the end-of-test summary is now different, with thresholds at the beginning (instead of being inlined with metrics), checks slightly modified, and metrics grouped by category (http, ws, network, etc.).
  • 🔠 The user can choose between:
    • the compact summary with --with-summary=compact, or with no argument, as it is the default choice.
    • the full summary with --with-summary=full.
    • the legacy summary with --with-summary=legacy.
  • ⚠️ The data model passed into the custom handleSummary function is now different. So, users relying on it must migrate their implementation or use --with-summary=legacy in the meantime.
    • 🙏🏻 (Dear reviewer) If you think we should definitely keep the old format, I can give it a try and see if I can adapt the new data to it before calling the function, without many extra allocations (I guess that since we don't propagate sinks here it should be fine in general). But please comment explicitly, stating your reasons and your concrete proposal.
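For users affected by this change, one defensive option is a handleSummary implementation that doesn't depend on specific fields of either data model; a minimal sketch (the pass-through approach is an illustration, not a migration guide):

```javascript
// Hypothetical sketch: a custom summary handler that serializes whatever
// data k6 passes in, so it keeps producing output under both the legacy
// and the new data model. In a real k6 script this would be declared as
// `export function handleSummary(data)`.
function handleSummary(data) {
  return {
    // The exact shape of `data` under the new model is an assumption here,
    // not the final API; we avoid touching its inner fields entirely.
    'summary.json': JSON.stringify(data, null, 2),
  };
}
```

Scripts that do read inner fields (e.g. specific metrics) are the ones that need migration, or --with-summary=legacy in the meantime.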

Technical details (for review purposes)

  • The core logic of the new end-of-test summary, how it collects metrics, etc., is based on a new output.Output named summary.
  • There's a new example test script under internal/cmd/testdata/summary/... with different scenarios, groups, thresholds, custom metrics, and more, that can be used for both automated and manual testing. If you think anything is missing, just suggest it.
  • The JS code responsible for rendering the summary has been largely refactored and type-documented (👏🏻 big shout-out to @oleiade), aiming to make that code far more maintainable than it has been until now. I guess that, once we merge this PR, we may need to copy-paste it back to https://github.com/grafana/k6-jslib-summary.
  • I left two data structures for the summary representation (lib.Summary vs lib.LegacySummary) to keep support for the legacy summary for some time in an easy way, so the two aren't mixed and the future cleanup is simpler: just remove that type and all the references to it, and simplify the few conditionals that behave differently depending on which summary type is provided.
    • Similarly, I left the old JS code for the summary as summary-legacy.js, for simpler cleanup whenever we remove that support, which I guess might be in v2 (once we ship the formalized JSON output format within v1).
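As a rough illustration of the "metrics grouped by category" behavior mentioned above (the prefix list and grouping rule are assumptions for the sketch, not the actual implementation in the summary JS code):

```javascript
// Hypothetical sketch: bucket metric names by their protocol prefix so a
// renderer could print them under http/ws/grpc/... headings. The prefix
// list is an assumption for illustration only.
const KNOWN_PREFIXES = ['http', 'ws', 'grpc', 'browser'];

function groupByCategory(metricNames) {
  const groups = {};
  for (const name of metricNames) {
    // Match e.g. "http_req_duration" against the "http" prefix.
    const prefix = KNOWN_PREFIXES.find((p) => name.startsWith(p + '_'));
    const category = prefix || 'other';
    (groups[category] = groups[category] || []).push(name);
  }
  return groups;
}
```

For example, groupByCategory(['http_req_duration', 'vus']) would place http_req_duration under http and vus under other.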

Internal Checklist

Before review readiness

  • Add support for a compact summary mode (likely enabled by default, so perhaps an "extended" flag, to avoid memory allocation issues) that only "relays" metrics but does not store metrics for groups and scenarios.
  • Revisit sorting: is there anything else we can sort for this first iteration? (Note that some ideas have been moved to a second iteration; see below.)
  • Decide what we want to do with the existing summary: whether we want to use the new lib.Report as the new API for custom handleSummary (note this would be a breaking change), or whether we want to ship this progressively.
    • Review and modify (if needed) the code in js/summary.go according to that decision.
  • Review and refactor the code in js/summary.js:
    • All the pieces from the old summary can probably be removed, if not removed yet.
    • De-duplicate the functions summarizeMetrics and summarizeMetricsWithThresholds, or just replace the first with the second one (as we may no longer need the first one if we remove the old summary code).
  • Review the structure and code of the output/summary package:
  • Verify all scenarios (different test scripts) work as expected (test with no groups nor scenarios, test with only groups, test with only scenarios, test with the combination of both, all of them with compact mode enabled or not).
  • Make the CI look 🟢 as an 🍏 again!
  • (Optional) Define the JSDoc for the newly introduced functions in JS code.
  • Remove the playground/full-summary files, or define a proper location for them.
    --- Ideas left for a second iteration ---
  • Re-write the JS code in TS.
  • Sorting the thresholds by metric name, sorting the tags within the metric name and perhaps sorting the sources.
  • Implementing a more optimized data structure (as explored in https://github.com/joanlopez/xk6-custosummary/tree/main/timeseries) to use less memory allocations while keeping scenarios & groups samples.

General

  • I have performed a self-review of my code.
  • I have added tests for my changes.
  • I have run linter locally (make lint) and all checks pass.
  • I have run tests locally (make tests) and all tests pass.
  • I have commented on my code, particularly in hard-to-understand areas.

@joanlopez joanlopez requested a review from a team as a code owner December 4, 2024 17:58
@joanlopez joanlopez requested review from mstoykov and olegbespalov and removed request for a team December 4, 2024 17:58
@joanlopez joanlopez marked this pull request as draft December 4, 2024 17:58
@oleiade oleiade force-pushed the new-end-of-test-summary-output branch from 58138ac to 126a188 on December 17, 2024 15:58
@joanlopez joanlopez force-pushed the new-end-of-test-summary-output branch from 1904999 to 4bb8f7a on February 7, 2025 21:11
@joanlopez joanlopez force-pushed the new-end-of-test-summary-output branch from 4bb8f7a to 9c1ed70 on February 7, 2025 21:14
@joanlopez joanlopez marked this pull request as ready for review February 7, 2025 21:59
@olegbespalov olegbespalov added the breaking change for PRs that need to be mentioned in the breaking changes section of the release notes label Feb 13, 2025
Contributor

@olegbespalov olegbespalov left a comment


I did a pass on that, but I'll need a couple more for sure; commenting just to signal that I'm on it.

internal/js/runner.go
lib/models.go
@olegbespalov olegbespalov self-requested a review February 13, 2025 16:59
@joanlopez
Contributor Author

I did a pass on that, but I'll need a couple more for sure; commenting just to signal that I'm on it.

Sure, thanks! Take your time!

@olegbespalov
Contributor

full: aiming to bring clearer and more valuable results to users. Find a screenshot below.

Where can I find the screenshot for that? 🤔

I've tried running it in both modes and I'm not sure I see any difference.

[image]

output/summary/summary.go
internal/cmd/run.go
internal/cmd/testdata/summary/browser.js
internal/js/summary.go
internal/lib/testutils/minirunner/minirunner.go
}

// NewSummary instantiates a new empty Summary.
func NewSummary() *Summary {
Contributor


Out of curiosity, why have we decided to use an empty summary constructor? I mean, why don't we require the mandatory values through the constructor (and maybe even apply validation)?

Contributor Author


I don't think I have a concrete answer, to be honest. I think the reason I followed this approach is that it's mostly a DTO, and concretely a recursive one, so it felt easier to initialize it empty (like when you initialize a map) and populate it as you go, instead of asking for the inner data in the constructor.

Most of the logic is on the summary.Output side, but I preferred not to couple both, even if that's the main use, at least for now.

output/summary/data.go
@joanlopez
Contributor Author

full: aiming to bring clearer and more valuable results to users. Find a screenshot below.

Where can I find the screenshot for that? 🤔

I've tried running it in both modes and I'm not sure I see any difference.

[image]

The key difference between the full and compact modes is that the former also displays partial results for groups and scenarios, while the latter only displays totals. However, if there are no groups or scenarios, their appearance is the same.

Do you have any other suggestions for differentiating between them?
cc/ @oleiade do you have any other ideas? Perhaps hiding some data? 🤔

To be fully transparent, I don't have a strong opinion here, but I'd advocate either making a change we all fully agree on, or moving forward as-is, to avoid looping in cycles. As far as I know, we offer no guarantees on the text summary format, so it should be fine to iterate on it in the near future if needed. The one shipped as part of this PR doesn't need to be the definitive one before and throughout v1.
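For what it's worth, the full-vs-compact distinction described above can be sketched roughly as follows (the data shapes and group names are illustrative assumptions, not the actual summary model):

```javascript
// Hypothetical sketch: full mode reports per-group partial results plus
// the total; compact mode reports only the total.
function summarizeCounts(samplesByGroup, mode) {
  const total = Object.values(samplesByGroup).reduce((a, b) => a + b, 0);
  if (mode === 'compact') return { total };
  return { total, groups: { ...samplesByGroup } };
}
```

With no groups, both modes would collapse to the same { total } shape, which is why the two outputs look identical for group-less scripts.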

@joanlopez joanlopez added this to the v1.0.0-rc1 milestone Feb 18, 2025
@joanlopez joanlopez force-pushed the new-end-of-test-summary-output branch from d2511c3 to b8ed6e0 on February 18, 2025 15:44
@olegbespalov
Contributor

@joanlopez like I said internally, I'm totally fine with proceeding as-is, with one exception: it's worth adjusting the text when it lands in the documentation.

full: aiming to bring clearer and more valuable results to users. Find a screenshot below.

IMO this is too generic and vague, and in my case it was a source of confusion, since I expected to see those more valuable results from the beginning.

@joanlopez
Contributor Author

@joanlopez like I said internally, I'm totally fine with proceeding as-is, with one exception: it's worth adjusting the text when it lands in the documentation.

full: aiming to bring clearer and more valuable results to users. Find a screenshot below.

IMO this is too generic and vague, and in my case it was a source of confusion, since I expected to see those more valuable results from the beginning.

Nice, thanks for your input @olegbespalov!
Let's wait for @oleiade's input, but yeah, I'll definitely take that into consideration when writing the corresponding docs.

@oleiade
Member

oleiade commented Feb 19, 2025

Hey @joanlopez @olegbespalov 👋🏻

Apologies for the delay and for being blocking here. I overlooked the compact vs full list of metrics during the last phases of the design, but after a bit of digging into some of our initial design docs, I found that we had come up with a candidate list of metrics to exclude in compact mode (and include in full/extended mode):

We exclude the following metrics from default results:
http_req_blocked
http_req_connecting
http_req_receiving
http_req_sending
http_req_tls_handshaking
http_req_waiting

The rationale was that we wanted to show only what is relevant to the vast majority of users in compact mode, and only bring the rest back when users explicitly request full mode. In general, I remember we discussed focusing on removing, by default, things that would only be relevant to a very small portion of users or to very specific use cases.
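Applied literally, that candidate list would translate into a filter along these lines (a sketch only; the actual compact-mode logic in the PR may differ):

```javascript
// Metrics proposed for exclusion from the compact summary, per the design
// doc quoted above.
const COMPACT_EXCLUDED = new Set([
  'http_req_blocked',
  'http_req_connecting',
  'http_req_receiving',
  'http_req_sending',
  'http_req_tls_handshaking',
  'http_req_waiting',
]);

// Hypothetical helper: compact mode drops the excluded metrics; full mode
// keeps everything.
function visibleMetrics(metricNames, mode) {
  if (mode === 'full') return metricNames;
  return metricNames.filter((name) => !COMPACT_EXCLUDED.has(name));
}
```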

At the time we outlined only HTTP metrics, because that's where we thought most of those cases were, but I think we should feel free to expand the list if we consider other metrics "not absolutely mandatory".

I also agree with @olegbespalov that we are explicitly excluding the end-of-test summary from the v1.Y.Z support policy, and we should feel free to iterate on it in the future. So we don't have to block on this if there are diverging ideas about, for instance, what the list of included/excluded metrics should be: even if we kept the list as it is now, that would be 👍🏻 for me 🙇🏻

Hope that's helpful, and again, great work @joanlopez I love it ❤️ 🚀
